[fix] disable imputation on future data #562

Merged 22 commits on Oct 9, 2024

lschilders
Collaborator

In the MissingValuesTransformer, data with missing "future" values, i.e. features with trailing null values, was imputed using the imputation_strategy. This is undesirable: imputation only makes sense for a gap that lies between two known values. This PR adds a pre-processing step that removes rows with any trailing null values and filters the same rows from the labels.
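
To illustrate the idea, here is a minimal pandas sketch of dropping rows with trailing nulls and filtering the labels to match. The helper name `drop_trailing_null_rows` and the example columns are hypothetical and chosen for illustration; this is not necessarily how the transformer in this PR implements it.

```python
import numpy as np
import pandas as pd


def drop_trailing_null_rows(x: pd.DataFrame, y: pd.Series):
    """Drop rows in which any feature belongs to a trailing run of nulls.

    Hypothetical helper for illustration only.
    """
    # After a backward fill, a value is still null only if no known value
    # follows it in its column, i.e. it is part of a trailing run of nulls.
    is_trailing_null = x.bfill().isnull()
    keep = ~is_trailing_null.any(axis="columns")
    # Apply the same row mask to the labels so features and labels stay aligned.
    return x.loc[keep], y.loc[keep]


index = pd.date_range("2024-10-01", periods=5, freq="D")
features = pd.DataFrame(
    {
        # The last two values have no known value after them ("future" data).
        "load": [1.0, np.nan, 3.0, np.nan, np.nan],
        # This gap sits between two known values and may still be imputed.
        "temperature": [10.0, 11.0, np.nan, 13.0, 14.0],
    },
    index=index,
)
labels = pd.Series([5.0, 6.0, 7.0, 8.0, 9.0], index=index, name="target")

x_kept, y_kept = drop_trailing_null_rows(features, labels)
```

In this sketch the last two rows are dropped because `load` has no later known value, while the interior gaps in `load` and `temperature` remain in `x_kept` for the imputation strategy to handle.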

@github-actions github-actions bot added the fix Something isn't working label Oct 2, 2024
Collaborator

@egordm egordm left a comment

Nice idea to put it in the missing values transformer; it feels like that was indeed a missing feature.
I added a few comments to handle a few edge cases I think we might face.

Collaborator

@egordm egordm left a comment

Nice! It looks much cleaner! I think there is one more thing I missed in the previous review, and then we can merge.

For non_null_feature_names to be passable, it should be added here:
https://github.com/OpenSTEF/openstef/blob/main/openstef/model/model_creator.py#L118

openstef/feature_engineering/missing_values_transformer.py (outdated, resolved)
sonarcloud bot commented Oct 8, 2024

@lschilders lschilders requested a review from egordm October 8, 2024 08:08
Collaborator

@egordm egordm left a comment

Awesome, approved!

@lschilders lschilders merged commit 430fa9e into main Oct 9, 2024
6 checks passed
@lschilders lschilders deleted the fix/disable_imputation_on_future_data branch October 9, 2024 07:23